Detecting a Tweet's Topic within a Large Number of Portuguese Twitter Trends
Abstract
This paper addresses Twitter topic detection in the presence of a large number of trending topics. We use a new technique, called Twitter Topic Fuzzy Fingerprints, and compare it with two popular text classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbours (kNN). Preliminary results show that it outperforms both techniques while also being much faster, which is an essential feature when processing large volumes of streaming data. We focus on a data set of Portuguese-language tweets and the corresponding top trends as indicated by Twitter.
1998 ACM Subject Classification: I.2.7 Natural Language Processing, H.2.8 Database Applications, I.5.4 Applications
Similar resources
Identifying interesting Twitter contents using topical analysis
Social media platforms such as Twitter are becoming increasingly mainstream which provides valuable user-generated information by publishing and sharing contents. Identifying interesting and useful contents from large text-streams is a crucial issue in social media because many users struggle with information overload. Retweeting as a forwarding function plays an important role in information p...
On-line Trend Analysis with Topic Models: #twitter Trends Detection Topic Model Online
We present a novel topic modelling-based methodology to track emerging events in microblogs such as Twitter. Our topic model has an in-built update mechanism based on time slices and implements a dynamic vocabulary. We first show that the method is robust in detecting events using a range of datasets with injected novel events, and then demonstrate its application in identifying trending topics...
This is your Twitter on drugs: Any questions?
Twitter can be a rich source of information when one wants to monitor trends related to a given topic. In this paper, we look at how tweets can augment a public health program that studies emerging patterns of illicit drug use. We describe the architecture necessary to collect vast numbers of tweets over time based on a large number of search terms and the challenges that come with finding rele...
Making Sense of Microposts at Scientific Conferences
Twitter is being widely used at scientific conferences. Following the microblogging stream, however, adds to the cognitive load of a conference participant. Therefore, there is a need for means of extracting the most important topics from a Twitter stream. This demo paper presents an adaptable system for detecting trends based on Twitter, and shows how it can be used within the setting of a con...
2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework
Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...
Publication date: 2014